Vertex AI Experiments

https://gyazo.com/580f64e7b3abb1ee5dd1acfa47d00f74

Introduction to Vertex AI Experiments | Google Cloud

Track, compare, manage experiments with Vertex AI Experiments | Google Cloud Blog

Class Experiment (1.56.0) | Python client library | Google Cloud

Class ExperimentRun (1.56.0) | Python client library | Google Cloud

Concepts

Vertex AI Experiments terminology - Introduction to Vertex AI Experiments | Google Cloud

The terms in the Japanese UI differ quite a bit from the English ones...

テスト (experiment; literally "test")

テスト実行 (experiment run; literally "test run")

パラメータ (parameter)

指標 (metrics)

実行 (execution)

アーティファクト (artifact)

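An execution is a step in an ML workflow, including but not limited to data preprocessing, training, and model evaluation.

Vertex ML Metadata is a more abstract data store; is Vertex AI Experiments the experiment-tracking layer built on it?

Is ExperimentRun a kind of Execution?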

googleapis/python-aiplatform@main - google/cloud/aiplatform/metadata/experiment_resources.py#L820-L823

table:metadata
 MetadataType   Schema                 Vertex AI SDK class
 Context        system.PipelineRun     aiplatform.PipelineJob
 Context        system.ExperimentRun   aiplatform.ExperimentRun
 Execution      system.Run             aiplatform.ExperimentRun

What about Experiment?

What is a schema in Vertex ML Metadata?

Introduction to Vertex ML Metadata | Vertex AI | Google Cloud

System schemas | Vertex AI | Google Cloud

system.Dataset

system.Artifact

system.Model

system.Metrics

Data model and resources | Vertex AI | Google Cloud

The exposed metadata resources are largely the same as those of the open source implementation of ML Metadata (MLMD).

google/ml-metadata: For recording and retrieving metadata associated with ML developer and data scientist workflows.

Are these the same as the artifact types listed here?

https://cloud.google.com/vertex-ai/docs/pipelines/artifact-types?hl=ja

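Basic logging

For now I will use this as a place to record model evaluations while tweaking parameters, so I am not interested in metrics during training.

Create or delete an experiment | Vertex AI | Google Cloud

Create and manage experiment runs | Vertex AI | Google Cloud

Manually log data to an experiment run | Vertex AI | Google Cloud

Getting started with ML experiment management using Vertex AI Experiments (Python) - Qiita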

get_started_with_vertex_experiments.ipynb - Colab

python-aiplatform/samples/model-builder/experiment_tracking at main · googleapis/python-aiplatform

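code:sample.py
 from google.cloud import aiplatform
 # experiment_name, experiment_description, project, location, run_name and value
 # are placeholders; define them before running this.
 aiplatform.init(
     experiment=experiment_name,
     experiment_description=experiment_description,
     experiment_tensorboard=False,  # do not attach a backing TensorBoard
     project=project,
     location=location,
 )
 # The context manager ends the run when the block exits.
 with aiplatform.start_run(run_name) as run:
     run.log_params({'key': value})
     run.log_metrics({'score': 123})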

init does too many things.

googleapis/python-aiplatform@bc8b14a - google/cloud/aiplatform/initializer.py#L112

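log_params can be called any number of times within a run; the values are merged and the last write wins.

By default, creating a run with an existing name raises an error; start_run(name, resume=True) lets you append to (and overwrite) it.

Passing resume=True on the very first call, before the run exists, is also an error.

params and metrics accept only simple scalars: float, int, str.

Is there no utility for flattening nested values?

log_params and log_metrics are, unsurprisingly, separate namespaces; the same key in both does not get merged.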
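Since params and metrics only take scalars, a nested config has to be flattened before logging. I did not find a built-in helper, so here is a minimal sketch of one. `flatten_params` and the `.` separator are my own choices (hypothetical, not part of the SDK), and I have not checked whether Vertex AI restricts characters in keys, so change `sep` if a key is rejected.

```python
from collections.abc import Mapping
from typing import Any, Dict, Union

Scalar = Union[float, int, str]


def flatten_params(nested: Mapping, sep: str = ".", prefix: str = "") -> Dict[str, Scalar]:
    """Flatten a nested mapping into {"a.b": scalar} form (hypothetical helper)."""
    flat: Dict[str, Scalar] = {}
    for key, value in nested.items():
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, Mapping):
            # Recurse into nested mappings, carrying the joined key as prefix.
            flat.update(flatten_params(value, sep=sep, prefix=name))
        elif isinstance(value, (int, float, str)):
            # Scalars are kept as-is (bool passes too, being an int subclass).
            flat[name] = value
        else:
            # Lists, None, and other objects are stringified.
            flat[name] = str(value)
    return flat


config = {"model": {"lr": 0.1, "layers": [1, 2]}, "seed": 1}
flat_config = flatten_params(config)
```

The result can then be passed to `run.log_params(flat_config)` inside a run.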

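Deleting metrics and artifacts

Running the same code repeatedly records them again each time, which is only natural.

Search Vertex ML Metadata and delete them there; I expected a run to be an execution, but it is actually a context.

There seems to be no way to delete an individual parameter or metric; the only option is to delete the whole run.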

code:delete.py
 with aiplatform.start_run(...) as run:
     # Delete every artifact linked to this run.
     for af in run.get_artifacts():
         af.delete()
 # What I wrote at first:
 with aiplatform.start_run(...) as run:
     artifacts = aiplatform.Artifact.list(filter=f'in_context("{run.resource_name}")')
     for artifact in artifacts:
         artifact.delete()

Logging ClassificationMetrics

code:classification.py
 from sklearn import metrics  # assumption: `metrics` here is sklearn.metrics
 labels = [True, False]
 run.log_classification_metrics(
     display_name="foobar",
     labels=[str(la) for la in labels],  # must be str
     matrix=metrics.confusion_matrix(
         all_evals["repetition"], all_evals["predict"], labels=labels
     ).tolist(),  # a np.array is not accepted
     # Passing the array as-is fails with an error like:
     # ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
 )
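The two gotchas above (labels must be str, matrix must be a plain nested list) can be isolated in a small helper. This is a rough sketch that builds the keyword arguments in pure Python without scikit-learn; `classification_metrics_kwargs` is a hypothetical name of mine, and the row = true label, column = predicted label layout follows the convention of sklearn's confusion_matrix.

```python
from typing import Any, Dict, Hashable, List, Sequence


def classification_metrics_kwargs(
    y_true: Sequence[Hashable],
    y_pred: Sequence[Hashable],
    labels: Sequence[Hashable],
    display_name: str,
) -> Dict[str, Any]:
    """Build kwargs for run.log_classification_metrics (hypothetical helper)."""
    index = {label: i for i, label in enumerate(labels)}
    # matrix[i][j] counts samples whose true label is labels[i]
    # and whose predicted label is labels[j].
    matrix: List[List[int]] = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return {
        "display_name": display_name,
        "labels": [str(label) for label in labels],  # must be str
        "matrix": matrix,  # plain nested list of int, not np.array
    }


kwargs = classification_metrics_kwargs(
    y_true=[True, True, False, False],
    y_pred=[True, False, False, False],
    labels=[True, False],
    display_name="foobar",
)
```

Inside a run this would be used as `run.log_classification_metrics(**kwargs)`.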

How do I associate a run with artifacts?

A run cannot hold artifacts directly (log_model does work, though), so:

GoogleCloudPlatform/vertex-ai-samples@main - notebooks/official/experiments/get_started_with_vertex_experiments.ipynb

Inside start_run, call start_execution and attach the artifacts to the execution.

execution.assign_input_artifacts([dataset_artifact])

execution.assign_output_artifacts([model_artifact])

Save execution.get_output_artifacts()[0].lineage_console_uri as a metric on the run.

Hmm, if I have no need for lineage in Vertex ML Metadata, it is simpler to upload to GCS inside the run and record the URL.

https://future-architect.github.io/articles/20200626/
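The simple "upload to GCS and record the URL" approach could look roughly like this. It is only a sketch: `gcs_uri`, `upload_and_log`, and the `artifact_uri` key are hypothetical names of mine, the google-cloud-storage package and default credentials are assumed, and the URI is logged as a param because it is a string.

```python
def gcs_uri(bucket_name: str, blob_path: str) -> str:
    """Build the gs:// URI for an object (hypothetical helper)."""
    return f"gs://{bucket_name}/{blob_path.lstrip('/')}"


def upload_and_log(run, local_path: str, bucket_name: str, blob_path: str,
                   param_key: str = "artifact_uri") -> str:
    """Upload a local file to GCS and record its URI on the run."""
    # Imported lazily so that gcs_uri can be used without the package installed.
    from google.cloud import storage

    blob_path = blob_path.lstrip("/")
    blob = storage.Client().bucket(bucket_name).blob(blob_path)
    blob.upload_from_filename(local_path)

    uri = gcs_uri(bucket_name, blob_path)
    # run is assumed to be the ExperimentRun returned by aiplatform.start_run.
    run.log_params({param_key: uri})
    return uri


example_uri = gcs_uri("my-bucket", "/runs/run-1/model.pkl")
```

Usage would be `upload_and_log(run, "model.pkl", "my-bucket", "runs/run-1/model.pkl")` inside the `with aiplatform.start_run(...)` block; the bucket and paths here are placeholders.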